Repetition¶
In this section, we will learn about source separation approaches that exploit a common feature of musical signals: repetition. In doing so, we will gain some understanding of the mechanics of source separation and of how an algorithm can use assumptions about a signal to separate it into sources.
We will explore three algorithms that attempt to separate a repeating background from a non-repeating foreground. The basic assumptions are that 1) there is repetition in the mixture, and 2) the repetition captures what we want to separate. These assumptions hold quite well if we want to separate a singer from a backing band, but might not work if we want to isolate a drum set from the rest of the band, because the drum set is usually playing a repeating pattern itself.
REPET¶
The first algorithm we will explore here is called the REpeating Pattern Extraction Technique, or REPET [RP12]. REPET works like this:
Find a repeating period, \(t_r\) seconds (e.g., the number of seconds after which a chord progression starts over).
Segment the spectrogram into \(N\) segments, each \(t_r\) seconds long.
“Overlay” those \(N\) segments.
Take the median of those \(N\) stacked segments and make a mask of the median values.
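The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not nussl's implementation: the function name `repet_mask` is made up, the repeating period is assumed to be already known and given in spectrogram frames rather than seconds, and clipping the median against the mixture before taking a ratio is one common way to turn the median into a soft mask.

```python
import numpy as np

def repet_mask(mag, period):
    """Soft mask for the repeating background of a magnitude spectrogram.

    mag: array of shape (n_freq, n_frames); period: repeating period in frames.
    (Hypothetical helper for illustration, not part of nussl.)
    """
    n_freq, n_frames = mag.shape
    n_segments = int(np.ceil(n_frames / period))
    # Pad with NaN so that padded frames are ignored by the median.
    padded = np.full((n_freq, n_segments * period), np.nan)
    padded[:, :n_frames] = mag
    # Cut into N segments and stack ("overlay") them.
    segments = padded.reshape(n_freq, n_segments, period)
    # Element-wise median across the stacked segments.
    repeating_segment = np.nanmedian(segments, axis=1)
    # Tile the median segment back to full length. The background
    # cannot have more energy than the mixture, so clip against it.
    model = np.tile(repeating_segment, (1, n_segments))[:, :n_frames]
    model = np.minimum(model, mag)
    return model / (mag + 1e-8)

# Toy "spectrogram": a 4-frame pattern repeated 5 times, plus one loud
# non-repeating event standing in for the foreground.
rng = np.random.default_rng(0)
pattern = rng.random((3, 4)) + 0.5
mag = np.tile(pattern, (1, 5))
mag[1, 6] += 10.0

background_mask = repet_mask(mag, period=4)
foreground_mask = 1.0 - background_mask
```

In this toy example the background mask stays close to 1 wherever the pattern repeats and drops at the single bin that holds the non-repeating event, because the median across segments ignores that outlier.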
We’ll use REPET to demonstrate how to run a source separation algorithm in nussl.
%%capture
!pip install git+https://github.com/source-separation/tutorial
# Do our imports
import nussl
import matplotlib.pyplot as plt
from common import viz
Let’s download an audio file that has a lot of repetition in it, and inspect and listen to it:
audio_path = nussl.efz_utils.download_audio_file('historyrepeating_7olLrex.wav', verbose=False)
history = nussl.AudioSignal(audio_path)
history.embed_audio()
plt.figure(figsize=(10, 3))
nussl.utils.visualize_spectrogram(history)
plt.title(str(history))
plt.tight_layout()
plt.show()
Now we need to instantiate a Repet object in nussl. We can do that like so:
repet = nussl.separation.primitive.Repet(history)
Now that the repet object has our AudioSignal, it’s easy to run the algorithm:
repet.run()
[<nussl.core.masks.soft_mask.SoftMask at 0x7f63f8c35410>,
<nussl.core.masks.soft_mask.SoftMask at 0x7f63fa597550>]
Oh, look! The repet object returned masks! We can get audio signals back by doing the following:
r_estimates = repet.make_audio_signals()
We can also chain both of those operations if we don’t care about the intermediate steps:
r_estimates = repet()
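Conceptually, going from masks to audio means multiplying each mask with the mixture's STFT and inverting the result. The sketch below shows that idea with SciPy on a synthetic signal. It is an assumption-laden illustration rather than what nussl does internally: the two-tone mixture, the 800 Hz cutoff, and the mask itself are made up, and the mask is binary for simplicity (a soft mask with values between 0 and 1 is applied the same way).

```python
import numpy as np
from scipy.signal import stft, istft

# Synthetic mixture of two tones (stand-ins for two sources).
sr = 8000
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 440 * t)
high_tone = 0.5 * np.sin(2 * np.pi * 1200 * t)
mixture = low_tone + high_tone

# Complex STFT of the mixture, shape (n_freq, n_frames).
freqs, _, spec = stft(mixture, fs=sr, nperseg=512)

# Hypothetical mask: 1 below 800 Hz, 0 above, repeated for every frame.
mask = np.where(freqs < 800, 1.0, 0.0)[:, None] * np.ones(spec.shape[1])

# Apply the mask and its complement, then invert back to the time domain.
_, est_low = istft(mask * spec, fs=sr, nperseg=512)
_, est_high = istft((1.0 - mask) * spec, fs=sr, nperseg=512)

# The inverse STFT can return a few extra padded samples; trim them.
est_low = est_low[: len(mixture)]
est_high = est_high[: len(mixture)]
```

Because the two masks sum to one, the two estimates add back up to the original mixture.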
Let’s check out the estimates that repet made:
viz.show_sources(r_estimates)
And there are our foreground and background sources!
Making it Interactive¶
nussl has hooks for gradio, so we can make our repet object interactive. All algorithms in nussl have this ability. Note that gradio must be installed for this to work (pip install gradio).
repet.interact()
Go ahead and play around with REPET. See what types of audio work and what types don’t. How does it work on electronic loops? How does it work on ambient music?
Review¶
The process of running a separation algorithm in nussl was only a few steps:
Instantiate a separation object with an audio signal. E.g.,
repet = nussl.separation.primitive.Repet(history)
Run the object to get the results. E.g.,
repet()
Now let’s look at a few other algorithms that leverage repetition in a musical recording and compare results to REPET.
REPET-SIM¶
REPET-SIM is a variant of REPET that doesn’t rely on a fixed repeating period. In fact, it doesn’t rely on repetition as explicitly as REPET does. REPET-SIM calculates a similarity matrix over all pairs of spectral frames in an STFT, selects the \(k\) nearest neighbors for each frame, and makes a mask by median filtering the bins across the selected neighbors.
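A minimal NumPy sketch of this idea follows. It is not nussl's implementation: the function name `repet_sim_mask` is hypothetical, cosine similarity is assumed as the similarity measure, and the final clipping and ratio are one simple way to turn the median into a soft mask.

```python
import numpy as np

def repet_sim_mask(mag, k=5):
    """Soft background mask from the k most similar frames of each frame.

    mag: magnitude spectrogram of shape (n_freq, n_frames).
    (Hypothetical helper for illustration, not part of nussl.)
    """
    # Cosine similarity between every pair of frames, shape (n_frames, n_frames).
    unit = mag / (np.linalg.norm(mag, axis=0, keepdims=True) + 1e-8)
    similarity = unit.T @ unit
    # Indices of the k most similar frames for each frame (itself included).
    neighbors = np.argsort(-similarity, axis=1)[:, :k]
    # mag[:, neighbors] has shape (n_freq, n_frames, k); take the median over k.
    model = np.median(mag[:, neighbors], axis=2)
    # The background cannot exceed the mixture, so clip before taking the ratio.
    model = np.minimum(model, mag)
    return model / (mag + 1e-8)

# Toy "spectrogram": a 4-frame pattern repeated 5 times, plus one loud
# non-repeating event standing in for the foreground.
rng = np.random.default_rng(0)
pattern = rng.random((6, 4)) + 0.5
mag = np.tile(pattern, (1, 5))
mag[1, 6] += 10.0

background_mask = repet_sim_mask(mag, k=5)
foreground_mask = 1.0 - background_mask
```

Unlike the fixed-period version, nothing here assumes that similar frames occur at regular intervals, which is why REPET-SIM can cope with repetition that is intermittent or irregular.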
We can run REPET-SIM the same way we can run REPET:
repet_sim = nussl.separation.primitive.RepetSim(history)
rs_estimates = repet_sim()
viz.show_sources(rs_estimates)
And let’s make an interactive one as well:
repet_sim.interact()
2DFT¶
We can also use a Two-dimensional Fourier Transform (2DFT) of a spectrogram to find repeating and non-repeating patterns. Repeating sections show up as peaks in the 2DFT and non-repeating parts are everything else. We can use a peak picker to separate the repeating from the non-repeating parts. That’s what this algorithm does:
# We can't start a variable name with a number,
# so this object is called FT2D
ft2d = nussl.separation.primitive.FT2D(history)
ft2d_estimates = ft2d()
viz.show_sources(ft2d_estimates)
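To give a rough feel for the peak-picking idea, here is a simplified sketch in NumPy and SciPy. It is not the peak picker that nussl uses: the function name `ft2d_masks`, the local-maximum test, and the neighborhood size are all assumptions chosen for brevity.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def ft2d_masks(mag, neighborhood=(1, 5)):
    """Background and foreground soft masks from peaks in the 2DFT.

    mag: magnitude spectrogram of shape (n_freq, n_frames).
    (Hypothetical helper for illustration, not part of nussl.)
    """
    # 2D Fourier transform of the magnitude spectrogram.
    ft = np.fft.fft2(mag)
    ft_mag = np.abs(ft)
    # A bin counts as a peak if it is the largest value in its neighborhood
    # (wrapping around the edges, since the 2DFT is periodic).
    peaks = ft_mag == maximum_filter(ft_mag, size=neighborhood, mode="wrap")
    # Keep only the peaks and go back to the spectrogram domain.
    background = np.abs(np.fft.ifft2(ft * peaks))
    # The background cannot exceed the mixture, so clip before taking the ratio.
    background = np.minimum(background, mag)
    bg_mask = background / (mag + 1e-8)
    return bg_mask, 1.0 - bg_mask

# Toy "spectrogram": a 4-frame pattern repeated 5 times, plus a small
# non-repeating event.
rng = np.random.default_rng(0)
pattern = rng.random((6, 4)) + 0.5
mag = np.tile(pattern, (1, 5))
mag[1, 6] += 1.0

background_mask, foreground_mask = ft2d_masks(mag)
```

For a perfectly periodic input, all of the 2DFT energy sits on the peaks, so the background mask is close to 1 everywhere. How well a non-repeating event is rejected depends heavily on the peak picker and its neighborhood size.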
And let’s make 2DFT interactive too:
ft2d.interact()
Harmonic-Percussive Source Separation (HPSS)¶
If you spend enough time visualizing musical signals on a spectrogram, you start to notice that harmonic sounds look similar to horizontal stripes on a spectrogram and percussive sounds look similar to vertical stripes. Harmonic-Percussive Source Separation takes advantage of this insight by applying a median filter along the time axis (horizontal, which enhances harmonic content) and along the frequency axis (vertical, which enhances percussive content) to make a mask:
hpss = nussl.separation.primitive.HPSS(history)
# hpss gives harmonic then percussive,
# so let's reverse the order of the list
hpss_estimates = hpss()[::-1]
viz.show_sources(hpss_estimates)
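The median-filtering idea behind HPSS can be sketched directly with SciPy. This is an illustration only, not nussl's implementation: the function name `hpss_masks` and the kernel length are assumptions. Smoothing along time keeps horizontal (harmonic) structure and suppresses short vertical events, while smoothing along frequency does the opposite.

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(mag, kernel=9):
    """Harmonic and percussive soft masks for a magnitude spectrogram.

    mag: array of shape (n_freq, n_frames).
    (Hypothetical helper for illustration, not part of nussl.)
    """
    # Median filter along time (axis 1): enhances horizontal stripes.
    harmonic = median_filter(mag, size=(1, kernel))
    # Median filter along frequency (axis 0): enhances vertical stripes.
    percussive = median_filter(mag, size=(kernel, 1))
    # Soft masks: each bin is shared in proportion to the two estimates.
    total = harmonic + percussive + 1e-8
    return harmonic / total, percussive / total

# Toy "spectrogram": a quiet floor, one sustained tone (horizontal line)
# and one click (vertical line).
mag = np.full((32, 32), 0.01)
mag[10, :] = 1.0   # harmonic: constant frequency over time
mag[:, 20] = 1.0   # percussive: all frequencies at one instant

harmonic_mask, percussive_mask = hpss_masks(mag)
```

In this toy example the harmonic mask is close to 1 along the sustained tone and the percussive mask is close to 1 along the click, away from the point where the two cross.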
Next Steps…¶
There you have it. Four simple algorithms to separate repeating and non-repeating parts and also harmonic and percussive parts.
Next we’ll talk about how we can model timbre using Non-negative Matrix Factorization.